AITopics | minimax risk

Collaborating Authors

minimax risk

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Blind Attacks on Machine Learners

Alex Beatson, Zhaoran Wang, Han Liu

Neural Information Processing SystemsMar-23-2026, 20:51:29 GMT

The importance of studying the robustness of learners to malicious data is well established. While much work has been done establishing both robust estimators and effective data injection attacks when the attacker is omniscient, the ability of an attacker to provably harm learning while having access to little information is largely unstudied. We study the potential of a "blind attacker" to provably limit a learner's performance by data injection attack without observing the learner's training set or any parameter of the distribution from which it is drawn. We provide examples of simple yet effective attacks in two settings: firstly, where an "informed learner" knows the strategy chosen by the attacker, and secondly, where a "blind learner" knows only the proportion of malicious data and some family to which the malicious distribution chosen by the attacker belongs. For each attack, we analyze minimax rates of convergence and establish lower bounds on the learner's minimax risk, exhibiting limits on a learner's ability to learn under data injection attack even when the attacker is "blind".

data mining, learner, machine learning, (17 more...)

Neural Information Processing Systems

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining (0.93)

Add feedback

Total Variation Classes Beyond 1d: Minimax Rates, and the Limitations of Linear Smoothers

Veeranjaneyulu Sadhanala, Yu-Xiang Wang, Ryan J. Tibshirani

Neural Information Processing SystemsMar-23-2026, 02:34:28 GMT

We consider the problem of estimating a function defined over nlocations on a d-dimensional grid (having all side lengths equal to n1/d). When the function is constrained to have discrete total variation bounded by Cn, we derive the minimax optimal (squared) `2 estimation error rate, parametrized by n,Cn. Total variation denoising, also known as the fused lasso, is seen to be rate optimal. Several simpler estimators exist, such as Laplacian smoothing and Laplacian eigenmaps. A natural question is: can these simpler estimators perform just as well?

artificial intelligence, estimator, machine learning, (18 more...)

Neural Information Processing Systems

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.68)

Add feedback

Information-theoretic Limits of Online Classification with Noisy Labels

Neural Information Processing SystemsMar-22-2026, 13:56:14 GMT

We study online classification with general hypothesis classes where the true labels are determined by some function within the class, but are corrupted by stochastic noise, and the features are generated adversarially. Predictions are made using observed labels and noiseless features, while the performance is measured via minimax risk when comparing against labels. The noisy mechanism is modeled via a general noisy kernel that specifies, for any individual data point, a set of distributions from which the actual noisy label distribution is chosen. We show that minimax risk is characterized (up to a logarithmic factor of the hypothesis class size) by the of the noisy label distributions induced by the kernel, of other properties such as the means and variances of the noise. Our main technique is based on a novel reduction to an online comparison scheme of two hypotheses, along with a new version of Le Cam-Birgé testing suitable for online settings. Our work provides the first comprehensive characterization of noisy online classification with guarantees that apply to the while addressing noisy observations.

artificial intelligence, name change, proceedings, (6 more...)

Neural Information Processing Systems

Industry: Education > Educational Setting > Online (1.00)

Technology: Information Technology > Artificial Intelligence (0.64)

Add feedback

Information-theoretic Limits of Online Classification with Noisy Labels

Neural Information Processing SystemsFeb-18-2026, 05:21:59 GMT

Learning from noisy data is a fundamental problem in many machine learning applications.

artificial intelligence, kernel, machine learning, (16 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Hawaii (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry:

Education > Educational Setting > Online (1.00)
Education > Educational Technology > Educational Software > Computer Based Training (0.41)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

b4dde7f1bc45bf9c0fda8db8f272b758-Paper-Conference.pdf

Neural Information Processing SystemsFeb-16-2026, 16:34:24 GMT

artificial intelligence, machine learning, mechanism, (18 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > California > San Francisco County > San Francisco (0.04)
Europe > France (0.04)
Asia > South Korea > Daejeon > Daejeon (0.04)

Genre: Research Report (0.46)

Industry:

Information Technology > Security & Privacy (1.00)
Law (0.68)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.68)

Add feedback

f75b757d3459c3e93e98ddab7b903938-Paper.pdf

Neural Information Processing SystemsFeb-11-2026, 04:22:06 GMT

covariate, estimation, estimator, (15 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > Canada (0.04)

Genre: Research Report > Experimental Study (0.49)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.44)

Add feedback

MinimaxClassificationwith0-1Loss andPerformanceGuarantees

Neural Information Processing SystemsFeb-7-2026, 07:55:11 GMT

Supervised classification techniques use training samples to find classification ruleswithsmallexpected0-1loss.

artificial intelligence, classification rule, machine learning, (16 more...)

Neural Information Processing Systems

Country:

Europe > Spain > Basque Country > Biscay Province > Bilbao (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.97)

Add feedback

Towards Sharp Minimax Risk Bounds for Operator Learning

Adcock, Ben, Maier, Gregor, Parhi, Rahul

arXiv.org Machine LearningDec-22-2025

A new paradigm in machine learning for scientific computing is focused on designing learning algorithms and methods for continuum problems. This paradigm is referred to as operator learning and has received considerable interest in the last few years [5,7,18,20,23-25,27,30,34,36]. The basic task may be posed as learning a map between infinite-dimensional function spaces, i.e., learning an operator F: X Y, where, for example, X and Y are real, separable Hilbert spaces. Operator learning naturally arises in many scientific problems where one wants to learn how a continuum model, often described by partial differential equations (PDEs), maps inputs, such as parameters or boundary conditions, to outputs, such as states or observables. A prototypical example to keep in mind is learning parameter-to-solution maps of parametric PDEs [1,2,11]. In contrast to more classical surrogate modeling, which typically focuses on learning finite-dimensional parameter-to-solution maps for some fixed discretization, operator learning directly aims to learn/approximate the continuum map F: X Y itself. Thus, the inputs and outputs are functions (not vectors) and the goal is to directly design discretization-invariant methods [7,23]. From a statistical perspective, this naturally leads to a nonparametric regression problem in which both the object of interest (the operator) and the observations (finite number of noisy samples) are infinite-dimensional.

assumption 2, eigenvalue, operator, (16 more...)

arXiv.org Machine Learning

2512.17805

Country:

North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
Europe > Germany > North Rhine-Westphalia > Arnsberg Region > Dortmund (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Education (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Add feedback

Learning the score under shape constraints

Lewis, Rebecca M., Feng, Oliver Y., Reeve, Henry W. J., Xu, Min, Samworth, Richard J.

arXiv.org Machine LearningDec-17-2025

Score estimation has recently emerged as a key modern statistical challenge, due to its pivotal role in generative modelling via diffusion models. Moreover, it is an essential ingredient in a new approach to linear regression via convex $M$-estimation, where the corresponding error densities are projected onto the log-concave class. Motivated by these applications, we study the minimax risk of score estimation with respect to squared $L^2(P_0)$-loss, where $P_0$ denotes an underlying log-concave distribution on $\mathbb{R}$. Such distributions have decreasing score functions, but on its own, this shape constraint is insufficient to guarantee a finite minimax risk. We therefore define subclasses of log-concave densities that capture two fundamental aspects of the estimation problem. First, we establish the crucial impact of tail behaviour on score estimation by determining the minimax rate over a class of log-concave densities whose score function exhibits controlled growth relative to the quantile levels. Second, we explore the interplay between smoothness and log-concavity by considering the class of log-concave densities with a scale restriction and a $(β,L)$-Hölder assumption on the log-density for some $β\in [1,2]$. We show that the minimax risk over this latter class is of order $L^{2/(2β+1)}n^{-β/(2β+1)}$ up to poly-logarithmic factors, where $n$ denotes the sample size. When $β< 2$, this rate is faster than could be obtained under either the shape constraint or the smoothness assumption alone. Our upper bounds are attained by a locally adaptive, multiscale estimator constructed from a uniform confidence band for the score function. This study highlights intriguing differences between the score estimation and density estimation problems over this shape-constrained class.

estimation, score estimation, score function, (17 more...)

arXiv.org Machine Learning

2512.14624

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Oceania > New Zealand (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)

Genre: Research Report (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Add feedback

Performance Guarantees for Quantum Neural Estimation of Entropies

Sreekumar, Sreejith, Goldfeld, Ziv, Wilde, Mark M.

arXiv.org Artificial IntelligenceNov-25-2025

Estimating quantum entropies and divergences is an important problem in quantum physics, information theory, and machine learning. Quantum neural estimators (QNEs), which utilize a hybrid classical-quantum architecture, have recently emerged as an appealing computational framework for estimating these measures. Such estimators combine classical neural networks with parametrized quantum circuits, and their deployment typically entails tedious tuning of hyperparameters controlling the sample size, network architecture, and circuit topology. This work initiates the study of formal guarantees for QNEs of measured (Rényi) relative entropies in the form of non-asymptotic error risk bounds. We further establish exponential tail bounds showing that the error is sub-Gaussian, and thus sharply concentrates about the ground truth value. For an appropriate sub-class of density operator pairs on a space of dimension $d$ with bounded Thompson metric, our theory establishes a copy complexity of $O(|Θ(\mathcal{U})|d/ε^2)$ for QNE with a quantum circuit parameter set $Θ(\mathcal{U})$, which has minimax optimal dependence on the accuracy $ε$. Additionally, if the density operator pairs are permutation invariant, we improve the dimension dependence above to $O(|Θ(\mathcal{U})|\mathrm{polylog}(d)/ε^2)$. Our theory aims to facilitate principled implementation of QNEs for measured relative entropies and guide hyperparameter tuning in practice.

artificial intelligence, entropy, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2511.19289

Country: North America > United States (0.67)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback